6 research outputs found

Classification of microarrays; synergistic effects between normalization, gene selection and machine learning

Author: Freyhult Eva
Hvidsten Torgeir R
Landfors Mattias
Rydén Patrik
Önskog Jenny
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Machine learning is a powerful approach for describing and predicting classes in microarray data. Although several comparative studies have investigated the relative performance of various machine learning methods, these often do not account for the fact that performance (e.g. error rate) is the result of a series of analysis steps, of which the most important are data normalization, gene selection and machine learning. Results: In this study, we used seven previously published cancer-related microarray data sets to compare the effects on classification performance of five normalization methods, three gene selection methods with 21 different numbers of selected genes and eight machine learning methods. Performance in terms of error rate was rigorously estimated by repeatedly employing a double cross validation approach. Since performance varies greatly between data sets, we devised an analysis method that first compares methods within individual data sets and then visualizes the comparisons across data sets. We discovered both well-performing individual methods and synergies between different methods. Conclusions: Support Vector Machines with a radial basis kernel, linear kernel or polynomial kernel of degree 2 all performed consistently well across data sets. We show that there is a synergistic relationship between these methods and gene selection based on the T-test and the selection of a relatively high number of genes. Also, we find that these methods benefit significantly from using normalized data, although it is hard to draw general conclusions about the relative performance of different normalization procedures.
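
The double cross validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the inner loop is used to choose the number of selected genes, it swaps in a nearest-centroid classifier for the Support Vector Machines, and it runs on synthetic data. All function names (`t_scores`, `fit_predict`, `double_cv_error`, `toy_data`) are hypothetical. The point it shows is that gene selection and tuning happen inside the training folds only, so the outer test samples never influence the model.

```python
import random
from statistics import mean, variance


def t_scores(X, y):
    """Absolute Welch t-statistic per gene; assumes labels 0/1, at least 2 samples per class."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return scores


def fit_predict(X_train, y_train, X_test, n_genes):
    """Rank genes on the training data only, then classify by nearest class centroid."""
    scores = t_scores(X_train, y_train)
    genes = sorted(range(len(scores)), key=lambda g: -scores[g])[:n_genes]
    centroids = {}
    for lab in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == lab]
        centroids[lab] = [mean(r[g] for r in rows) for g in genes]
    preds = []
    for r in X_test:
        dist = {lab: sum((r[g] - c) ** 2 for g, c in zip(genes, cen))
                for lab, cen in centroids.items()}
        preds.append(min(dist, key=dist.get))
    return preds


def folds(n, k, rng):
    """Split indices 0..n-1 into k shuffled folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[f::k] for f in range(k)]


def cv_error(X, y, n_genes, k, rng):
    """Plain k-fold error rate for a fixed number of selected genes."""
    wrong = 0
    for test in folds(len(y), k, rng):
        held = set(test)
        train = [i for i in range(len(y)) if i not in held]
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test], n_genes)
        wrong += sum(p != y[i] for p, i in zip(preds, test))
    return wrong / len(y)


def double_cv_error(X, y, gene_counts, outer_k=5, inner_k=4, seed=0):
    """Outer loop estimates error; inner loop picks n_genes on the outer training part."""
    rng = random.Random(seed)
    wrong = 0
    for test in folds(len(y), outer_k, rng):
        held = set(test)
        train = [i for i in range(len(y)) if i not in held]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        best = min(gene_counts, key=lambda m: cv_error(X_tr, y_tr, m, inner_k, rng))
        preds = fit_predict(X_tr, y_tr, [X[i] for i in test], best)
        wrong += sum(p != y[i] for p, i in zip(preds, test))
    return wrong / len(y)


def toy_data(n=40, n_genes=30, n_informative=5, shift=4.0, seed=1):
    """Synthetic two-class data: the first n_informative genes are shifted in class 1."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(0, 1) + (shift if lab == 1 and g < n_informative else 0.0)
          for g in range(n_genes)] for lab in y]
    return X, y


X, y = toy_data()
error = double_cv_error(X, y, gene_counts=[2, 5, 10])
print(f"double CV error estimate: {error:.3f}")
```

Repeating the call with different `seed` values and averaging the results would mimic the repeated estimation described in the abstract.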

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Publikationer från Uppsala Universitet

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Populus tremula (European aspen) shows no evidence of sexual dimorphism

Author: Bastian Schiffthaler
Benedicte R Albrectsen
Jenny Önskog
Kathryn M Robinson
Nathaniel R Street
Nicolas Delhomme
Niklas Mähler
Pär K Ingvarsson
Stefan Jansson
Torgeir R Hvidsten
Publication venue: Springer Science and Business Media LLC

Crossref

Challenges in microarray class discovery : a comprehensive examination of normalization, gene selection and clustering

Author: Freyhult Eva
Hvidsten Torgeir R.
Landfors Mattias
Rydén Patrik
Önskog Jenny
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, preprocessing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index.
Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions: The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data.
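
The adjusted Rand index used above to score clusterings against known classes can be computed from pair counts. The following is a minimal standard-library sketch of the usual pair-counting formula, not code from the paper; the function name `adjusted_rand_index` and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: treat as perfect agreement

    # Pairs placed together in both labelings, in the true classes, in the clusters.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / total_pairs  # agreement expected by chance
    max_index = (together_true + together_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (together_both - expected) / (max_index - expected)


known_classes = [0, 0, 1, 1]
print(adjusted_rand_index(known_classes, [1, 1, 0, 0]))  # same partition, relabeled
print(adjusted_rand_index(known_classes, [0, 1, 0, 1]))  # partition that splits every class
```

The index is 1 for identical partitions regardless of label names, close to 0 for random assignments, and can be negative when agreement is worse than chance.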

Publikationer från Umeå universitet

Springer - Publisher Connector

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Populus tremula (European aspen) shows no evidence of sexual dimorphism

Author: Albrectsen Benedicte
Delhomme Nicolas
Hvidsten Torgeir
Ingvarsson Pär
Jansson Stefan
Mähler Niklas
Robinson Kathryn
Schiffthaler Bastian
Street Nathaniel
Önskog Jenny
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Background: Evolutionary theory suggests that males and females may evolve sexually dimorphic phenotypic and biochemical traits concordant with each sex having different optimal strategies of resource investment to maximise reproductive success and fitness. Such sexual dimorphism would result in sex-biased gene expression patterns in non-floral organs for autosomal genes associated with the control and development of such phenotypic traits. Results: We examined morphological, biochemical and herbivory traits to test for sexually dimorphic resource allocation strategies within collections of sexually mature and immature Populus tremula (European aspen) trees. In addition, we profiled gene expression in mature leaves of sexually mature wild trees using whole-genome oligonucleotide microarrays and RNA-Sequencing. Conclusions: We found no evidence of sexual dimorphism or differential resource investment strategies between males and females in either sexually immature or mature trees. Similarly, single-gene differential expression and machine learning approaches revealed no evidence of large-scale sex-biased gene expression. However, two significantly differentially expressed genes were identified from the RNA-Seq data, one of which is a robust diagnostic marker of sex in P. tremula.

Publikationer från Umeå universitet

Populus tremula (European aspen) shows no evidence of sexual dimorphism

Author: Albrectsen Benedicte Rieber
Delhomme Nicolas
Hvidsten Torgeir Rhoden
Ingvarsson Pär K.
Jansson Stefan
Mähler Niklas
Robinson Kathryn M
Schiffthaler Bastian
Street Nathaniel
Önskog Jenny
Publication date: 01/01/2014

Brage NMBU

Publikationer från Umeå universitet

Springer - Publisher Connector

Copenhagen University Research Information System

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

NORA - Norwegian Open Research Archives